fix(p2p): penalize peers for errors during response reading #21680

Merged
mralj merged 1 commit into merge-train/spartan from palla/fix-oversized-snappy-no-penalty
Mar 17, 2026

@spalladino
Contributor

@spalladino spalladino commented Mar 17, 2026

Motivation

Errors during `readMessage` (oversized snappy responses, corrupt data, etc.) were caught and silently converted to `{ status: UNKNOWN }` return values instead of re-throwing. Since `sendRequestToPeer` only calls `handleResponseError` in its own catch block, none of these errors resulted in peer penalties. The request was simply retried with another peer, allowing a malicious peer to waste bandwidth indefinitely.
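The failure mode can be illustrated with a minimal, synchronous sketch. It is not the actual p2p code: only the names `readMessage`, `sendRequestToPeer`, `handleResponseError`, and the `UNKNOWN` status come from this description; the signatures, the `read` callback, and the `penalized` list are hypothetical stand-ins.

```typescript
// Hypothetical, simplified stand-ins; the real functions are async and take different arguments.
type Status = 'SUCCESS' | 'UNKNOWN';
type ReadResult = { status: Status; data?: Uint8Array };

// Records which peers were penalized (assumption: stands in for the real peer scoring).
const penalized: string[] = [];

function handleResponseError(peerId: string, _err: unknown): void {
  penalized.push(peerId);
}

// Pre-fix shape: every failure while reading is swallowed and mapped to UNKNOWN.
function readMessage(read: () => Uint8Array): ReadResult {
  try {
    return { status: 'SUCCESS', data: read() };
  } catch {
    return { status: 'UNKNOWN' };
  }
}

function sendRequestToPeer(peerId: string, read: () => Uint8Array): ReadResult | undefined {
  try {
    return readMessage(read);
  } catch (err) {
    // Never reached for read failures, because readMessage does not throw.
    handleResponseError(peerId, err);
    return undefined;
  }
}

// A peer that sends corrupt data gets an UNKNOWN status and no penalty.
const result = sendRequestToPeer('peer-a', () => {
  throw new Error('corrupt snappy data');
});
```

In this sketch the caller sees `UNKNOWN` and `penalized` stays empty, which mirrors the gap described above.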

Approach

Re-throw non-protocol errors from `readMessage` so they propagate to `sendRequestToPeer`'s catch block where `handleResponseError` applies peer penalties. Additionally, introduce a dedicated `OversizedSnappyResponseError` class so oversized responses get a harsher `LowToleranceError` penalty (score -50, banned after 2 offenses) instead of falling through to the generic `HighToleranceError` catch-all.
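The "banned after 2 offenses" figure can be checked with a small arithmetic sketch. Only the -50 penalty comes from this description; the ban threshold of -100 is an assumption chosen to be consistent with the stated two offenses, and any score decay between offenses is ignored.

```typescript
// From the PR description: each LowToleranceError costs 50 points.
const LOW_TOLERANCE_PENALTY = -50;
// Assumption (not stated in the PR): a peer is banned once its score reaches -100.
const ASSUMED_BAN_THRESHOLD = -100;

// Number of identical offenses needed to reach the threshold, ignoring score decay.
function offensesUntilBan(penalty: number, banThreshold: number): number {
  return Math.ceil(banThreshold / penalty);
}

const offenses = offensesUntilBan(LOW_TOLERANCE_PENALTY, ASSUMED_BAN_THRESHOLD); // 2
```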

Changes

  • p2p (reqresp): Changed `readMessage` catch block to only return status for `ReqRespStatusError` and re-throw all other errors, so they reach `handleResponseError` for penalization
  • p2p (encoding): Added `OversizedSnappyResponseError` class for explicit categorization
  • p2p (reqresp): Added `OversizedSnappyResponseError` handling in `categorizeResponseError` with `LowToleranceError` severity
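The three changes above could look roughly like the following synchronous sketch. The class and function names are taken from this description, but their constructors, signatures, the `'FAILURE'` status string, the `read` callback, and the `penalties` list are hypothetical, and the real `categorizeResponseError` presumably handles more error types than shown here.

```typescript
// Hypothetical stand-ins for the error classes named in the PR.
class ReqRespStatusError extends Error {
  readonly status: string;
  constructor(status: string) {
    super(`peer returned status ${status}`);
    Object.setPrototypeOf(this, new.target.prototype); // keeps instanceof reliable on older targets
    this.status = status;
  }
}

class OversizedSnappyResponseError extends Error {
  constructor(message: string) {
    super(message);
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

type Severity = 'LowToleranceError' | 'HighToleranceError';
type ReadResult = { status: string; data?: Uint8Array };

// Post-fix shape: only protocol status errors become a return value; the rest propagate.
function readMessage(read: () => Uint8Array): ReadResult {
  try {
    return { status: 'SUCCESS', data: read() };
  } catch (err) {
    if (err instanceof ReqRespStatusError) {
      return { status: err.status };
    }
    throw err;
  }
}

// Oversized responses get the harsher category; everything else falls to the catch-all.
function categorizeResponseError(err: unknown): Severity {
  if (err instanceof OversizedSnappyResponseError) return 'LowToleranceError';
  return 'HighToleranceError';
}

// Assumption: stands in for handleResponseError applying a penalty to the peer.
const penalties: { peerId: string; severity: Severity }[] = [];

function sendRequestToPeer(peerId: string, read: () => Uint8Array): ReadResult | undefined {
  try {
    return readMessage(read);
  } catch (err) {
    penalties.push({ peerId, severity: categorizeResponseError(err) });
    return undefined;
  }
}

// Usage: a protocol status is returned, while read failures are penalized.
const statusResult = sendRequestToPeer('peer-a', () => { throw new ReqRespStatusError('FAILURE'); });
sendRequestToPeer('peer-b', () => { throw new OversizedSnappyResponseError('response too large'); });
sendRequestToPeer('peer-c', () => { throw new Error('corrupt data'); });
```

Keeping `ReqRespStatusError` as a return value preserves the existing handling of protocol-level statuses, while any other failure during reading now reaches the penalty path.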

@spalladino spalladino changed the title from "fix(p2p): penalize peers sending oversized snappy responses" to "fix(p2p): penalize peers for errors during response reading" Mar 17, 2026
Errors in readMessage (invalid status bytes, oversized snappy responses,
corrupt data) were caught and silently converted to UNKNOWN status returns.
Since sendRequestToPeer only calls handleResponseError in its own catch
block, none of these errors resulted in peer penalties. The request was
simply retried with another peer, allowing a malicious peer to waste
bandwidth indefinitely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@spalladino spalladino force-pushed the palla/fix-oversized-snappy-no-penalty branch from 14a8239 to 34ee6a4 Compare March 17, 2026 15:21
@AztecBot
Collaborator

Flakey Tests

🤖 says: This CI run detected 1 test that failed, but was tolerated due to a .test_patterns.yml entry.

FLAKED (http://ci.aztec-labs.com/7abc45c6251d4cb3): yarn-project/end-to-end/scripts/run_test.sh simple src/e2e_p2p/duplicate_attestation_slash.test.ts (148s) (code: 0) group:e2e-p2p-epoch-flakes

Contributor

@mralj mralj left a comment

@spalladino I cannot find anything wrong with the reasoning/implementation - I think this is a good callout

@mralj mralj merged commit b32d86f into merge-train/spartan Mar 17, 2026
11 checks passed
@mralj mralj deleted the palla/fix-oversized-snappy-no-penalty branch March 17, 2026 16:07
AztecBot pushed a commit that referenced this pull request Mar 17, 2026
@AztecBot
Collaborator

✅ Successfully backported to backport-to-v4-next-staging #21654.

github-merge-queue bot pushed a commit that referenced this pull request Mar 18, 2026
BEGIN_COMMIT_OVERRIDE
fix(p2p): fall back to maxTxsPerCheckpoint for per-block tx validation
(#21605)
chore: fixing M3 devcontainer builds (#21611)
fix: clamp finalized block to oldest available in world-state (#21643)
chore: fix proving logs script (#21335)
fix: (A-649) tx collector bench test (#21619)
fix(validator): process block proposals from own validator keys in HA
setups (#21603)
fix: add bounds when allocating arrays in deserialization (#21622)
fix: skip handleChainFinalized when block is behind oldest available
(#21656)
chore: demote finalized block skip log to trace (#21661)
fix: skip -march auto-detection for cross-compilation presets (#21356)
chore: revert "add bounds when allocating arrays in deserialization"
(#21622) (#21666)
fix: capture txs not available error reason in proposal handler (#21670)
fix: estimate gas in bot and make BatchCall.simulate() return
SimulationResult (#21676)
fix: prevent HA peer proposals from blocking equivocation in duplicate
proposal test (#21673)
fix(p2p): penalize peers for errors during response reading (#21680)
feat(sequencer): add build-ahead config and metrics (#20779)
chore: fixing build on mac (#21685)
fix: HA deadlock for last block edge case (#21690)
fix: process all contract classes in storeBroadcastedIndividualFunctions
(A-683) (#21686)
chore: add slack success post on nightly scenario (#21701)
fix(builder): persist contractsDB across blocks within a checkpoint
(#21520)
fix: only delete logs from rolled-back blocks, not entire tag (A-686)
(#21687)
chore(p2p): lower attestation pool per-slot caps to 2 (#21709)
chore(p2p): remove unused method (#21678)
fix(p2p): penalize peer on tx rejected by pool (#21677)
fix(test): workaround slow mock creation (#21708)
fix(sequencer): fix checkpoint budget redistribution for multi-block
slots (#21692)
fix: batch checkpoint unwinding in handleEpochPrune (A-690) (#21668)
fix(sequencer): add missing opts arg to checkpoint_builder tests
(#21733)
fix: race condition in fast tx collection (#21496)
fix: increase default postgres disk size from 1Gi to 10Gi (#21741)
fix: update batch_tx_requester tests to use RequestTracker (#21734)
chore: replace dead BOOTSTRAP_TO env var with bootstrap.sh build arg
(#21744)
fix(sequencer): extract gas and blob configs from valid requests only
(A-677) (#21747)
fix: deflake attempt for l1_tx_utils (#21743)
fix(test): fix flaky keystore reload test (#21749)
fix(test): fix flaky duplicate_attestation_slash test (#21753)
feat(pipeline): introduce pipeline views for building (#21026)
END_COMMIT_OVERRIDE
AztecBot added a commit that referenced this pull request Mar 19, 2026
BEGIN_COMMIT_OVERRIDE
feat: entrypoint replay protection (#21649)
feat: guard BoundedVec oracle returns against dirty trailing storage
(#21589)
fix: add bounds when allocating arrays in deserialization (#21622)
feat: implement manual Packable for structs with sub-Field members
(#21576)
fix(aztec-node): throw on existing nullifier in
getLowNullifierMembershipWitness (#21472)
fix: use trait dispatch for array Packable::unpack in card_game_contract
(#21683)
fix(p2p): penalize peers for errors during response reading (#21680)
fix: update nullifier non-inclusion test expectations after early oracle
throw (backport #21600) (#21615)
fix(aztec-nr): fix OOB index with nonzero offset (#21613)
fix(builder): persist contractsDB across blocks within a checkpoint
(#21520)
fix(stdlib): accept null return_type for void Noir functions (#21647)
feat: gas estimations on send (#21646)
fix(validator): process block proposals from own validator keys in HA
setups (backport #21603) (#21659)
fix(p2p): penalize peer on tx rejected by pool (#21677)
fix(sequencer): fix checkpoint budget redistribution for multi-block
slots (#21692)
feat: sync cache invalidation oracle (backport #21459) (#21730)
feat!: make AES128 decrypt oracle return Option (backport #21696)
(#21735)
feat!: include init_hash in private initialization nullifier (backport
#21427) (#21736)
fix(sequencer): extract gas and blob configs from valid requests only
(A-677) (#21747)
chore: backport #21744 — replace dead BOOTSTRAP_TO env var with
bootstrap.sh build arg (#21748)
refactor: revert remove assert_bounded_vec_trimmed (#21758)
END_COMMIT_OVERRIDE
3 participants